Clustering of Variables Based On a Probabilistic Approach Defined on the Hypersphere

نویسنده

  • Paulo Gomes
چکیده

We consider n individuals described by p standardized variables, represented by points of the surface of the unit hypersphere Sn-1. For a previous choice of n individuals we suppose that the set of observables variables comes from a mixture of bipolar Watson distribution defined on the hypersphere. EM and Dynamic Clusters algorithms are used for identification of such mixture. We obtain estimates of parameters for each Watson component and then a partition of the set of variables into homogeneous groups of variables. Additionally we will present a factor analysis model where unobservable factors are just the maximum likelihood estimators of Watson directional parameters, exactly the first principal component of data matrix associated to each group previously identified. Such alternative model it will yield us to directly interpretable solutions (simple structure), avoiding factors rotations. Keywords—Dynamic Clusters algorithm, EM algorithm, Factor analysis model, Hierarchical Clustering, Watson distribution.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Probabilistic Integrated Planning of Primary and Secondary Distribution Networks based on a Hybrid Heuristic and GA Approach

The integrated planning of distribution system reveals a complex and non-linear problem being integrated with integer and discontinues variables. Due to these technical and modeling complexities, many researchers tend to optimize the primary and secondary distribution networks individually which depreciates the accuracy of the results. Accordingly, the integrated planning of these networks is p...

متن کامل

Retaining Customers Using Clustering and Association Rules in Insurance Industry: A Case Study

This study clusters customers and finds the characteristics of different groups in a life insurance company in order to find a way for prediction of customer behavior based on payment. The approach is to use clustering and association rules based on CRISP-DM methodology in data mining. The researcher could classify customers of each policy in three different clusters, using association rules. A...

متن کامل

A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

In recent years, the advancement of information gathering technologies such as GPS and GSM networks have led to huge complex datasets such as time series and trajectories. As a result it is essential to use appropriate methods to analyze the produced large raw datasets. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...

متن کامل

A Chance Constraint Approach to Multi Response Optimization Based on a Network Data Envelopment Analysis

In this paper, a novel approach for multi response optimization is presented. In the proposed approach, response variables in treatments combination occur with a certain probability. Moreover, we assume that each treatment has a network style. Because of the probabilistic nature of treatment combination, the proposed approach can compute the efficiency of each treatment under the desirable reli...

متن کامل

Customer behavior mining based on RFM model to improve the customer relationship management

Companies’ managers are very enthusiastic to extract the hidden and valuable knowledge from their organization data. Data mining is a new and well-known technique, which can be implemented on customers data and discover the hidden knowledge and information from customers' behaviors. Organizations use data mining to improve their customer relationship management processes. In this paper R, F, an...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013